Visual Servoing and Motion Control of a Robotic
Device in Inspection and Replacement Tasks
Abstract: This paper presents a generalized framework for
image-based visual servoing of an articulated arm that is
to be deployed inside a simulated reactor vessel environment.
Wall tiles of the vessel, idealized as rectangular grids on a
surface, are to be inspected, and an attempt is made to replace
the damaged tiles during shutdown periods of the machine. The
vision sensing methodology of the proposed arm is explained.
The arm has a camera located at the wrist (eye-in-hand), and the
control action takes place at the joint level. Only
preliminary results are illustrated.

Index Terms: In-vessel inspection, kinematics, manipulator
deployment, serial robot, visual servoing.


I. INTRODUCTION 

In recent years, a wide variety of applications involving
autonomous robot behaviour in unknown environments have
been developed. New-generation robots adapt to
changing conditions in real time. Such behaviour is necessary
especially for difficult practical tasks such as search
and rescue missions, reconnaissance, surveillance and
inspection in complex and dangerous surroundings. As an
example, remote handling robots used in inspection and
maintenance of in-vessel components of fusion devices
require a robust non-contact sensing system. In such
instances, robot vision is crucial, since it mimics the human
sense of sight and allows non-contact measurement of the
environment. The control inputs for the robot motors are
produced by processing image data (for example, extraction of
contours, features, corners and other visual primitives). The
basic purpose of visual control is to control the pose of the
robot's end-effector relative to a target object or a set of
target features. Visual servoing, or visual servo control (VSC),
involves various techniques from image processing, computer
vision and control theory. Using such an approach, systems
with low-cost sensors and actuators can be developed. In
VSC, the information from the camera is used within the control
loop to position the tracking device as required.
The vision data may be acquired either from a camera that is
mounted directly on the manipulator (eye-in-hand) or from one at
a fixed location in the scene (eye-to-hand). The features on the
image plane are servo-controlled to their goal positions.
There are two traditional approaches among the
vision-based control schemes [1]: (i) position-based VSC and
(ii) image-based VSC. In a position-based system, the control is
performed in task space based on the three-dimensional
information retrieved from the image. Here, the camera pose
is estimated using visual information and the control design is
a classical state-space design. The quality of the response
depends on the quality of the pose estimation, which makes the
control sensitive to camera calibration errors. In an
image-based system, feedback is defined in terms of
image features and the controller is designed to drive the image
features towards a goal configuration. Thus, it implicitly
solves the Cartesian motion planning problem. The approach
is therefore relatively robust to camera calibration and target
modelling errors. Image-based approaches basically exploit
2D visual measurements such as points or lines tracked in the
image during task execution.

A robot has several links and joints, each requiring a
positioning reference in relation to a predefined origin
point. The vision system defines image coordinates based on
where the camera points, without regard to a fixed reference
origin. Pixel locations within an image frame must be mapped
to the corresponding robot coordinates for proper
visual robotic guidance. Several works relating to vision-guided
robotic systems have been reported in the literature. As early as
1985, Sanderson et al. [1] proposed an adaptive control
approach for the nonlinear time-varying relationship between
the robot pose and image features in image-based servoing.
They described detailed simulations of image-based visual
servoing for a variety of 3-degree-of-freedom manipulators.
Seaden and Ang [2] worked on relative target-object
(rigid-body) pose estimation for vision-based control of
industrial robots. They developed and implemented a
closed-form target-pose estimation algorithm. Feddama [3]
applied an explicit feature-space trajectory generator and
closed-loop joint control to overcome problems due to a low
visual sampling rate. Here, experimental work based on
image-based visual servoing of a 4-degree-of-freedom robot was
presented. Hashimoto et al. [4] also illustrated simulations
comparing position-based and image-based approaches.
Korayem et al. [5] designed and simulated vision-based
control and performance tests for a 3-P robot using Visual C++
software. They used a camera installed on the
end-effector of the robot to find a target. A feature-based
visual servoing control on the end-effector was used to reach
the target. Jara et al. [6] employed Java to develop an
interactive tool for industrial robot simulations. Pinto et al. [7]
proposed an eye-on-hand system in which cameras are
replaced by a 2D laser range finder attached to
a robotic manipulator that executes a predefined path to produce
grayscale images of the workstation. Fang et al. [8] proposed
augmented reality for programming a robot for trajectory
planning and transformation into task-optimized executable
robot paths. Therefore, the impact of pose estimation in visual
servoing, where the relative pose between a camera and a
target can be used for real-time control of robot motion, is a
topic of present interest.

In a fusion reactor, the first wall inside a shielded blanket is
a basic in-vessel component that is often affected by the plasma
strokes. The tiles of the first wall are required to
withstand the intense flux of energetic particles (hydrogen
isotopes and neutrons) as well as heat loads. The wall tiles
therefore require frequent inspection during shutdown periods,
which calls for remote in-vessel inspection and guided robotic
systems. Several earlier works [9-13] have
illustrated the implementation issues of robots in fusion
reactor vessels built to ITER standards. Designing such a robotic
system involves multiple modules, such as a flexible
manipulator mechanism that advances freely into the ports of
the vessel, gripper design for handling the wall tiles, a
vision-based inspection scheme for monitoring, and the
remote control of joints as required. In the present
work, the vision module is presented for this specific application.
The proposed 7-degree-of-freedom articulated redundant
robot manipulator configuration is first explained. Kinematics
issues are briefly outlined. The vision sensing methodology of
the proposed manipulator is then described.

II. Description of robotic manipulator 

The manipulator considered in the present work is an articulated
serial redundant platform. It can be controlled by a teach
pendant or a joystick device. It has a sturdy base that can be
moved on rails and locked at a particular position. Further,
there is a waist that can be swivelled about the vertical axis,
just like other commercial industrial arms. This is controlled
by a high-torque DC motor through a metallic gear train. At the
end of the waist, there is a shoulder joint, which is driven by
another DC motor through a belt transmission. This link is
further connected in succession with two more links, as shown
in Fig. 1.


The end of the final link joins a wrist, which is controlled by
joint motors that enable the end-effector, holding a tool, to
advance to a required posture. The wrist pitch is controlled by
a similar DC motor located at the base, while roll is guided by
a motor of relatively smaller torque rating mounted at the wrist.
The end-effector is a two-state gripper with rubber pads,
operated by a worm-wheel based four-bar mechanism. As the
gripper is activated only after the end-effector is aligned with
the target point, this translational degree of freedom is
generally not counted in the overall degrees of freedom of the
manipulator. The visual sensor in this work is a digital camera
mounted before the end-effector to monitor the target motion in
a 3-dimensional (3D) workspace. The camera is assumed to be
calibrated, so that the intrinsic and extrinsic parameters, such
as the focal length, the physical size and resolution of the
image sensor, and the transformation matrix between the camera
and the end-effector, are known.

Fig. 1 Schematic of the P6R manipulator system

A. Kinematic Model

The kinematic model refers to the methodology of deriving the
relationship between the joint angles and the end-effector pose.
The conventional Denavit-Hartenberg (D-H) notation of link
frames is adopted and the parameters are first identified. The
link homogeneous transformation matrices are then obtained from
the table of known and unknown variables. Fig. 2 shows the
kinematic link frames considered for further analysis.

Fig. 2 Kinematic model of the manipulator

The D-H parameters of the manipulator are shown in Table I.

TABLE I
D-H Parameters of the manipulator

| Link | $\theta$ | $d$ | $a$ | $\alpha$ | Joint limits |
|------|------------|-------|-------|----------|----------------|
| 1 | 0 | $d_1$ | 0 | 0 | 0 to 800 mm |
| 2 | $\theta_2$ | $d_2$ | 0 | $-\pi/2$ | -170° to 170° |
| 3 | $\theta_3$ | 0 | $a_3$ | 0 | -60° to 90° |
| 4 | $\theta_4$ | 0 | $a_4$ | 0 | -45° to 70° |
| 5 | $\theta_5$ | 0 | $a_5$ | 0 | -45° to 70° |
| 6 | $\theta_6$ | 0 | 0 | $-\pi/2$ | -170° to 170° |
| 7 | $\theta_7$ | 0 | 0 | 0 | -100° to 100° |

The transformation matrix of coordinate frame i, represented in
frame i-1, can be written as:

$$[{}^{i-1}_{i}T] = [{}^{i-1}_{i}T(\theta_i, d_i, a_i, \alpha_i)] \tag{1}$$

where $\theta_i$, $d_i$, $a_i$ and $\alpha_i$ are the D-H parameters of link i as
shown in Table I, with c = cos and s = sin. The link lengths are
$a_3 = a_4 = a_5 = 240$ mm and the waist height is $d_1 = 300$ mm. The
overall forward kinematics
$[{}^{0}_{7}T] = [{}^{0}_{1}T][{}^{1}_{2}T][{}^{2}_{3}T] \cdots [{}^{6}_{7}T]$
of the manipulator can be easily obtained by multiplying the
individual link transformation matrices. This can be done with
the symbolic programming available in the MATLAB/Maple
environment. In the control task, the joint motors are actuated
as per the sensing information available in Cartesian space.
Inverse kinematics therefore computes the required joint angles
when the pose of the end-effector is supplied (see Appendix). The
Jacobian matrix describing the relationship between the joint
angular velocities and the corresponding end-effector linear
velocities can then be obtained with the method of differential
transformations.
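
The chaining of link transforms from Table I can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' MATLAB/Maple code. It assumes the classic D-H convention (Rot_z, Trans_z, Trans_x, Rot_x), which may differ from the convention behind Eq. (1). The offset `d2 = 100.0` mm is a hypothetical placeholder because its value is not stated, and `d1 = 300.0` mm is used only as an example joint value. The function names are illustrative.

```python
import math


def dh_transform(theta, d, a, alpha):
    """4x4 link transform under the classic D-H convention (an assumption):
    Rot_z(theta) * Trans_z(d) * Trans_x(a) * Rot_x(alpha)."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul4(m, n):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(m[i][k] * n[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def forward_kinematics(d1, thetas, d2=100.0, a3=240.0, a4=240.0, a5=240.0):
    """Pose of frame 7 in frame 0 (lengths in mm, angles in rad).

    d1     : travel of the prismatic joint 1
    thetas : (theta2, ..., theta7)
    d2     : hypothetical offset, not given in the text
    """
    t2, t3, t4, t5, t6, t7 = thetas
    # Rows of Table I as (theta, d, a, alpha).
    rows = [
        (0.0, d1, 0.0, 0.0),
        (t2, d2, 0.0, -math.pi / 2),
        (t3, 0.0, a3, 0.0),
        (t4, 0.0, a4, 0.0),
        (t5, 0.0, a5, 0.0),
        (t6, 0.0, 0.0, -math.pi / 2),
        (t7, 0.0, 0.0, 0.0),
    ]
    pose = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for row in rows:
        pose = matmul4(pose, dh_transform(*row))
    return pose


# Example: all revolute joints at zero, prismatic joint at 300 mm.
home = forward_kinematics(300.0, (0.0,) * 6)
home_position = [home[i][3] for i in range(3)]
```

Under these assumptions the three 240 mm links line up along the base x-axis at the zero pose, so the last column of `home` holds the position of the wrist frame origin.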

B. Frames of Reference 

It is assumed that a camera is rigidly mounted over the wrist 
and that the object is placed in the camera’s field of view as 
shown in Fig. 3. 



Fig. 3 Frames of Reference 

The four relevant coordinate frames are: the object frame {O}
centered at the object, the camera frame {C} centered at the
camera lens, the manipulator hand frame {H} centered at the
robot end-effector and the robot base frame {R} centered at
the robot base. In practice, locating an object with respect to
the robot base requires: (i) camera calibration, which
describes the relative position and orientation between the
object and the camera, $[{}^{C}_{O}T]$; (ii) hand-eye calibration,
which describes the relative position and orientation between
the camera and the robot hand, $[{}^{H}_{C}T]$; and (iii) robot
calibration, which is the manipulator kinematics relating the
hand frame to the base, $[{}^{R}_{H}T]$. Given a point P on the
object (in homogeneous coordinates), the point described in the
robot frame is given by

$$\{{}^{R}P\} = [{}^{R}_{O}T]\,\{{}^{O}P\} \tag{2}$$

where

$$[{}^{R}_{O}T] = [{}^{R}_{H}T]\,[{}^{H}_{C}T]\,[{}^{C}_{O}T] \tag{3}$$

is a 4x4 homogeneous Euclidean transformation between the
object and robot coordinate frames.
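
As an illustration of Eqs. (2) and (3), the sketch below composes three homogeneous transforms and maps an object point into the robot base frame. All numerical values are hypothetical, and the helper `make_transform` is an illustrative simplification that only rotates about the z-axis. A real calibration would supply full 3D rotations.

```python
import math


def make_transform(rz_deg, tx, ty, tz):
    """Homogeneous transform with a rotation about z (degrees) and a
    translation (tx, ty, tz). Simplified on purpose for illustration."""
    c = math.cos(math.radians(rz_deg))
    s = math.sin(math.radians(rz_deg))
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul4(m, n):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(m[i][k] * n[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(t, p):
    """Apply a 4x4 transform to a homogeneous point [x, y, z, 1]."""
    return [sum(t[i][k] * p[k] for k in range(4)) for i in range(4)]


# Hypothetical calibration results (mm and degrees).
R_T_H = make_transform(90.0, 500.0, 0.0, 300.0)  # hand in robot base frame
H_T_C = make_transform(0.0, 0.0, 0.0, 50.0)      # camera in hand frame
C_T_O = make_transform(0.0, 0.0, 0.0, 400.0)     # object in camera frame

# Eq. (3): chain the transforms from base to object.
R_T_O = matmul4(matmul4(R_T_H, H_T_C), C_T_O)

# Eq. (2): express an object point in the robot base frame.
P_O = [10.0, 0.0, 0.0, 1.0]
P_R = apply(R_T_O, P_O)
```

The order of multiplication matters: each transform maps coordinates from the frame named in its subscript to the frame named in its superscript, so adjacent indices must match along the chain.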

III. Introduction to image-based visual servoing 

A visual servoing task is in fact the minimization of an error
vector defined in the image plane. When a number of image
features on the image plane are given, it is required to select
a set of generalized image coordinates to characterize them. Let
$s(q(t), X)$ be the generalized image coordinates, with $X$
representing the geometric parameters associated with the
features in 3-D space. Then, the error vector is defined as:

$$e(t) = s(q(t), X) - s^* \tag{4}$$

Here, $s^*$ is the desired feature information vector. The
definition of the parameter vector $s$ determines the visual
servo control scheme. To design the visual servo controller, a
relationship between the time derivative of $s$ and the camera
velocity $v_c$ is first determined as follows:

$$\dot{s} = L_s v_c \tag{5}$$

where $L_s$ is called the image Jacobian matrix. Now, the
relationship between the camera velocity and the error vector is
obtained by considering $s^*$ to be a constant parameter (due to
the fixed goal pose) as follows:

$$\dot{e} = L_s v_c \tag{6}$$

In order to decrease the error, $\dot{e} = -\lambda e$ is chosen, resulting in

$$v_c = -\lambda L_s^{+} e \tag{7}$$

Here, $L_s^{+}$ is the pseudo-inverse of $L_s$, which cannot be
calculated in real conditions, and hence an approximation
$\hat{L}_s^{+}$ is often used. Fig. 4 shows the relationship between
the camera frame and the image frame.
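
The effect of the control law in Eq. (7) can be illustrated with a small simulation. This is a minimal sketch under strong simplifying assumptions: the image Jacobian `L_hyp` is a hypothetical constant 2x2 invertible matrix, so its pseudo-inverse reduces to the ordinary inverse, and the feature dynamics of Eq. (5) are integrated with a plain Euler step. The gain and step size are arbitrary example values.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix (assumed non-singular)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def matvec2(m, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def error_norm(s, s_star):
    """Euclidean norm of the feature error e = s - s*."""
    return ((s[0] - s_star[0]) ** 2 + (s[1] - s_star[1]) ** 2) ** 0.5


def servo_step(s, s_star, L, gain, dt):
    """One Euler step of the loop: e from Eq. (4), v_c from Eq. (7),
    feature update from Eq. (5)."""
    e = [s[0] - s_star[0], s[1] - s_star[1]]
    v_c = [-gain * x for x in matvec2(inverse_2x2(L), e)]
    s_dot = matvec2(L, v_c)
    return [s[0] + dt * s_dot[0], s[1] + dt * s_dot[1]], v_c


L_hyp = [[-2.0, 0.1], [0.3, -2.0]]  # hypothetical image Jacobian
s_star = [0.0, 0.0]                 # desired features
s = [0.4, -0.3]                     # initial features
initial_error = error_norm(s, s_star)

for _ in range(1000):               # 10 s of simulated time
    s, v_c = servo_step(s, s_star, L_hyp, gain=0.5, dt=0.01)

final_error = error_norm(s, s_star)
```

With an exact Jacobian the error follows the chosen first-order dynamics and shrinks by the same factor at every step. With an estimated Jacobian the decay is generally no longer a pure exponential.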



A 3D point P can be projected onto the image plane as a 2D
point using the perspective projection:

$$x = \frac{X}{Z} = \frac{u - u_0}{f} \tag{8}$$

$$y = \frac{Y}{Z} = \frac{v - v_0}{f} \tag{9}$$

Here, (u, v) are the pixel coordinates of the image point. The
parameters $u_0$, $v_0$ and $f$ are called the camera intrinsic
parameters. The velocity of the 3D point with respect to the
camera frame is:

$$\dot{P} = -v_c - \omega_c \times P \tag{10}$$

where $v_c$ and $\omega_c$ are the instantaneous linear and angular
velocities of the camera. Here, $P = [X\ Y\ Z]^T$ is a point on the
3D object. Using eqs. (8) and (9) along with eq. (10), we can
simplify and write the relation as:

$$\dot{X} = L_x V_c \tag{11}$$

where $X = [x\ y]^T$ and $V_c = [v_x\ v_y\ v_z\ \omega_x\ \omega_y\ \omega_z]^T$
are vectors. The matrix $L_x$ is the image Jacobian given by [14]:

$$L_x = \begin{bmatrix} -\dfrac{1}{Z} & 0 & \dfrac{x}{Z} & xy & -(1 + x^2) & y \\ 0 & -\dfrac{1}{Z} & \dfrac{y}{Z} & 1 + y^2 & -xy & -x \end{bmatrix} \tag{12}$$

Here, the depth Z should be accurately measured; otherwise
an estimate $\hat{L}_x$ should be used.
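
The relation between Eqs. (8) to (11) can be checked numerically. The sketch below uses the commonly cited point-feature form of the image Jacobian, which is assumed here to be what Eq. (12) denotes, and compares its prediction with a finite-difference estimate obtained by moving the 3D point according to Eq. (10). The point coordinates and the camera twist are hypothetical example values.

```python
def project(P):
    """Normalized perspective projection: x = X/Z, y = Y/Z."""
    X, Y, Z = P
    return (X / Z, Y / Z)


def interaction_matrix(x, y, Z):
    """2x6 image Jacobian of a point feature (assumed standard form)."""
    return [
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
    ]


def point_velocity(P, v, w):
    """Eq. (10): velocity of a fixed point seen from the moving camera."""
    X, Y, Z = P
    cross = (w[1] * Z - w[2] * Y, w[2] * X - w[0] * Z, w[0] * Y - w[1] * X)
    return tuple(-v[i] - cross[i] for i in range(3))


def feature_velocity(P, twist):
    """Eq. (11): image-plane velocity predicted by the image Jacobian."""
    x, y = project(P)
    L = interaction_matrix(x, y, P[2])
    return tuple(sum(L[r][c] * twist[c] for c in range(6)) for r in range(2))


# Hypothetical point (m) and camera twist (m/s, rad/s).
P = (0.1, -0.2, 1.5)
twist = (0.05, -0.02, 0.10, 0.01, 0.03, -0.04)

predicted = feature_velocity(P, twist)

# Finite-difference check: move the point for a short time and re-project.
dt = 1e-6
P_dot = point_velocity(P, twist[:3], twist[3:])
P_next = tuple(P[i] + dt * P_dot[i] for i in range(3))
x0, y0 = project(P)
x1, y1 = project(P_next)
numeric = ((x1 - x0) / dt, (y1 - y0) / dt)
```

The two estimates should agree to within the finite-difference error, which shows how the depth Z enters only through the translational columns of the matrix.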


IV. Proposed methodology 

The control problem can be defined as follows: design a set
of joint trajectories such that the end-effector can move to a
desired grasp position, using the pose of the target in the
workspace obtained from the vision system. Fig. 5 shows the
workspace of the manipulator. The assembly of the articulated
robotic manipulator is tested in CATIA and the parts are
exported to ADAMS VIEW (R2013). The joints are then defined
there to visualize the kinematic simulations.



Fig. 5 Work volume at the end-effector 


The various steps involved in the process of vision-based
monitoring are: (i) digital image acquisition, (ii)
pre-processing (filtering etc.), (iii) feature extraction (to
acquire important image features such as lines and edges), (iv)
detection/segmentation (to determine image points or regions
for further processing) and (v) decision making. The
automated tile-inspection procedure is explained below. To
capture the image, a digital camera (Basler acA640-90gm)
with the Sony ICX424 CCD sensor is employed. It delivers 90
frames per second at a resolution of 659 (horizontal) x 494
(vertical) pixels, which is about 0.3 MP. Fig. 6 shows the camera
used. It can be interfaced with a PC through an Ethernet port and
operates with a 12 V DC power supply. It is to be mounted at
the end of the arm, and care has to be taken to reduce image
blurring.



Fig. 6 Digital camera employed 


Image processing techniques such as median filtering,
contrast and brightness adjustment and edge sharpening have to
be applied to enhance the quality of the points of interest and
to remove the noise. There are basically two kinds of image
enhancement techniques: spatial-domain methods and
frequency-domain methods. Spatial-domain methods deal
directly with image pixels. The pixel values are manipulated to
achieve the desired enhancement of the image. In frequency-domain
methods, the image is first transferred into the frequency domain
(i.e., the Fourier transform of the image is first computed).
Then all enhancement operations are performed, and finally the
inverse Fourier transform is performed to get the resultant image.
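
As an example of a spatial-domain method, the sketch below applies a 3x3 median filter to a small grayscale image stored as a nested list. It is an illustrative pure-Python version, not the LabVIEW processing chain used in this work. Border pixels are simply copied, which is one of several possible border policies.

```python
def median_filter_3x3(image):
    """Return a filtered copy of a 2D grayscale image (list of rows).

    Each interior pixel is replaced by the median of its 3x3
    neighbourhood. Border pixels are left unchanged.
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            window = sorted(
                image[rr][cc]
                for rr in (r - 1, r, r + 1)
                for cc in (c - 1, c, c + 1)
            )
            result[r][c] = window[4]  # middle of 9 sorted values
    return result


# Uniform 5x5 patch with one bright noise pixel in the centre.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
filtered = median_filter_3x3(noisy)
```

An isolated outlier is removed because it never reaches the middle of the sorted window, whereas a long thin crack can also be weakened by the same mechanism, so the window size is a trade-off.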

In a typical visual inspection of the tiles of the reactor
vessel, a sector of the entire surface is scanned from left to
right in the top-to-bottom direction. For the sake of simulation,
tiles are represented by rectangular grids drawn on paper, and
the paper is affixed to a cylindrical tank as shown in Fig. 7.



Fig. 7 Simulated set-up of vessel tile 


Some of the grids contain a point crack, a vertical line crack
or an inclined line crack, as shown in Fig. 8. The first task is
to predict the nature of the crack using image processing.



Fig. 8 Tile having inclined crack 
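
One possible way to label the crack types mentioned above is sketched below. It is a hypothetical rule, not the method used in this work: the crack is assumed to be already segmented into a list of (row, column) pixel coordinates, a small bounding box is treated as a point crack, and the orientation of elongated cracks is taken from second-order central moments. The thresholds `point_extent` and `vertical_tol_deg` are illustrative assumptions.

```python
import math


def classify_crack(pixels, point_extent=3, vertical_tol_deg=10.0):
    """Label crack pixels as 'no crack', 'point', 'vertical line'
    or 'inclined line'. Any elongated crack that is not close to
    vertical is labelled 'inclined line'."""
    if not pixels:
        return "no crack"
    rows = [p[0] for p in pixels]
    cols = [p[1] for p in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    if height <= point_extent and width <= point_extent:
        return "point"

    # Second-order central moments of the pixel cloud.
    n = len(pixels)
    mean_r = sum(rows) / n
    mean_c = sum(cols) / n
    var_r = sum((r - mean_r) ** 2 for r in rows) / n
    var_c = sum((c - mean_c) ** 2 for c in cols) / n
    cov = sum((r - mean_r) * (c - mean_c) for r, c in pixels) / n

    # Principal-axis angle measured from the horizontal (column) axis.
    angle_deg = abs(math.degrees(0.5 * math.atan2(2.0 * cov, var_c - var_r)))
    if abs(angle_deg - 90.0) <= vertical_tol_deg:
        return "vertical line"
    return "inclined line"


vertical_crack = [(r, 5) for r in range(10)]
inclined_crack = [(i, i) for i in range(10)]
point_crack = [(2, 2), (2, 3), (3, 2)]
labels = [classify_crack(c) for c in (vertical_crack, inclined_crack, point_crack)]
```

In practice the thresholds would have to be tuned to the image resolution and tile size, and a segmentation step is needed before this rule can be applied.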


The image processing operation is planned using
LabVIEW software, as shown in Fig. 9.

[Figure 9: block diagram with blocks labelled "Vision" and "Vision builder"]

Fig. 9 Real time maintenance
V. Conclusion

In this paper, an outline of the image-based visual
servoing approach for a redundant 7-DOF manipulator
with an eye-in-hand configuration has been presented.
The kinematics of the manipulator and the vision system employed
were explained. The simulated environment of the wall tiles of
the reactor vessel was described. As future work, the camera is
to be attached to the end of the arm and the captured images are
to be processed to predict the type of fault. Gripper
tactile sensing and removal of the faulty tiles
will also be considered.


Appendix 

The derivation of the inverse kinematics of the present
manipulator is based on the derivation of the inverse kinematics
of a PUMA 560 robot. The rotation of the first rotational axis,
$\theta_2$, is obtained by writing the kinematic equation in the
following form:

$$[{}^{0}_{2}T]^{-1}[{}^{0}_{7}T] = [{}^{2}_{3}T][{}^{3}_{4}T][{}^{4}_{5}T][{}^{5}_{6}T][{}^{6}_{7}T] = [{}^{0}_{2}T]^{-1}[T] \tag{A1}$$


where [T] is the actual orientation and position of the
end-effector, given by:

$$[T] = \begin{bmatrix} n_x & o_x & a_x & p_x \\ n_y & o_y & a_y & p_y \\ n_z & o_z & a_z & p_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{A2}$$


Equating the (2,4) elements on both sides of eq. (A1), we get

$$-s_2 p_x + c_2 (d_1 - p_z) = 0 \tag{A3}$$

which gives

$$\theta_2 = \mathrm{atan2}(d_1 - p_z,\ p_x) \tag{A4}$$

$$\theta_2 = \pi + \mathrm{atan2}(d_1 - p_z,\ p_x) \tag{A5}$$

When $\theta_2$ is known, the transform $[{}^{0}_{2}T(d_1, \theta_2)]$ is fully
defined. The rotation $\theta_4$ is obtained by equating elements (1,4)
and (3,4) on both sides of eq. (A1):

$$c_2 p_x + s_2 (d_1 - p_z) = a_4 c_3 c_4 + a_5 s_3 s_4 + a_4 c_3 + a_3 \tag{A6}$$

$$p_y - d_2 = -a_5 s_3 c_4 - a_5 c_3 s_4 - a_4 c_3 \tag{A7}$$

These equations give two sets of $\theta_4$. The rotation $\theta_3$ can be
obtained by writing

$$[{}^{0}_{4}T]^{-1}[{}^{0}_{7}T] = [{}^{4}_{5}T][{}^{5}_{6}T][{}^{6}_{7}T] \tag{A8}$$

Equating elements (1,4) and (3,4) on both sides of eq. (A8),
we get an expression of the form:

$$\theta_3 + \theta_4 = \mathrm{atan2}(K_1, K_2) \tag{A9}$$

Since 4 combinations of solutions of $\theta_2$ and $\theta_4$ exist, $\theta_3$
will have 4 possible solutions. The process is continued for
$\theta_5$. Finally, a given wrist position can be achieved by 4
combinations of the 4 joint rotations $\theta_2$, $\theta_3$, $\theta_4$ and
$\theta_5$. The pitch and roll angles of the wrist, $\theta_6$ and $\theta_7$,
are obtained by equating the terms in the rotation matrices.
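
The two solutions for the first rotation can be checked numerically. The sketch below assumes that Eqs. (A4) and (A5) take the form atan2(d1 - pz, px) and its offset by pi, which is the form that satisfies Eq. (A3). The numerical values of `px`, `pz` and `d1` are hypothetical.

```python
import math


def theta2_solutions(px, pz, d1):
    """Two candidate values of theta2, following the assumed form of
    Eqs. (A4) and (A5)."""
    base = math.atan2(d1 - pz, px)
    return (base, base + math.pi)


def residual_a3(theta2, px, pz, d1):
    """Left-hand side of Eq. (A3). It is zero for a valid theta2."""
    return -math.sin(theta2) * px + math.cos(theta2) * (d1 - pz)


# Hypothetical end-effector position components and joint offset (mm).
px, pz, d1 = 300.0, 100.0, 300.0
candidates = theta2_solutions(px, pz, d1)
residuals = [residual_a3(t, px, pz, d1) for t in candidates]
```

Both candidates satisfy Eq. (A3), so the choice between them has to come from the joint limits in Table I or from the remaining equations of the derivation.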